Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 12 |
| Number of observations | 15932992 |
| Missing cells | 43472692 |
| Missing cells (%) | 22.7% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 9.3 GiB |
| Average record size in memory | 628.6 B |
Variable types
| Type | Count |
|---|---|
| CAT | 10 |
| NUM | 2 |
Reproduction
| Field | Value |
|---|---|
| Analysis started | 2020-02-24 20:19:54.075962 |
| Analysis finished | 2020-02-24 21:37:01.836531 |
| Version | pandas-profiling v2.5.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| user_id has a high cardinality: 730803 distinct values | High cardinality |
| session_id has a high cardinality: 910683 distinct values | High cardinality |
| reference has a high cardinality: 400277 distinct values | High cardinality |
| platform has a high cardinality: 55 distinct values | High cardinality |
| city has a high cardinality: 34752 distinct values | High cardinality |
| current_filters has a high cardinality: 61980 distinct values | High cardinality |
| impressions has a high cardinality: 1059891 distinct values | High cardinality |
| prices has a high cardinality: 1066775 distinct values | High cardinality |
| current_filters has 14779880 (92.8%) missing values | Missing |
| impressions has 14346406 (90.0%) missing values | Missing |
| prices has 14346406 (90.0%) missing values | Missing |
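The dataset-level missing-cell figures can be cross-checked against the three per-column "Missing" warnings. The following is only an illustrative check using numbers copied from this report; the variable names are ad hoc and are not part of pandas-profiling.

```python
# Figures copied from the report (dataset statistics and "Missing" warnings).
n_rows = 15_932_992
n_cols = 12
missing_per_column = {
    "current_filters": 14_779_880,
    "impressions": 14_346_406,
    "prices": 14_346_406,
}

total_cells = n_rows * n_cols
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)           # missing cells summed over the three columns
print(f"{missing_pct:.1f}%")   # share of all cells that are missing
```

The sum over the three columns reproduces the reported "Missing cells" total, so no other column contributes missing values.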
user_id
Categorical
| Distinct count | 730803 |
|---|---|
| Unique (%) | 4.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 121.6 MiB |
Common values
| Value | Count | Frequency (%) | |
| 6JWWFFNUMY6Y | 6230 | < 0.1% | |
| 0H73OEP6Z71O | 4084 | < 0.1% | |
| 7K4V4W05S7X7 | 4077 | < 0.1% | |
| SX3I42SKZEVH | 3876 | < 0.1% | |
| Q46K4RJHTQFR | 3810 | < 0.1% | |
| G7U04A2HQFSG | 3607 | < 0.1% | |
| M8E88OK4G3IE | 3416 | < 0.1% | |
| A5ZFRVCM2Z1L | 3358 | < 0.1% | |
| EQKV6819ZD7M | 3320 | < 0.1% | |
| CQM1034RBOZI | 3141 | < 0.1% | |
| Other values (730793) | 15894073 | 99.8% |
Length
| Max length | 12 |
|---|---|
| Mean length | 12 |
| Min length | 12 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 26 | 72.2% | |
| Decimal_Number | 10 | 27.8% |
| Value | Count | Frequency (%) | |
| Latin | 26 | 72.2% | |
| Common | 10 | 27.8% |
| Value | Count | Frequency (%) | |
| ASCII | 36 | 100.0% |
session_id
Categorical
| Distinct count | 910683 |
|---|---|
| Unique (%) | 5.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 121.6 MiB |
Common values
| Value | Count | Frequency (%) | |
| 3167404ed3197 | 3522 | < 0.1% | |
| 948641e533837 | 2816 | < 0.1% | |
| 9233fb83c116b | 2800 | < 0.1% | |
| 191ae48e3cb8e | 2648 | < 0.1% | |
| c9b863c921a2d | 2640 | < 0.1% | |
| c4dc91b78ded1 | 2518 | < 0.1% | |
| 4c8e1e29b93fc | 2340 | < 0.1% | |
| b34847506ba7f | 2310 | < 0.1% | |
| 58a263c18b945 | 2219 | < 0.1% | |
| e9a8f4e36ea10 | 2216 | < 0.1% | |
| Other values (910673) | 15906963 | 99.8% |
Length
| Max length | 13 |
|---|---|
| Mean length | 13 |
| Min length | 13 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 22 | 68.8% | |
| Decimal_Number | 10 | 31.2% |
| Value | Count | Frequency (%) | |
| Latin | 22 | 68.8% | |
| Common | 10 | 31.2% |
| Value | Count | Frequency (%) | |
| ASCII | 32 | 100.0% |
timestamp
Real number (ℝ≥0)
| Distinct count | 518048 |
|---|---|
| Unique (%) | 3.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1541304041.0163977 |
|---|---|
| Minimum | 1541030408 |
| Maximum | 1541548799 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 121.6 MiB |
Quantile statistics
| Minimum | 1541030408 |
|---|---|
| 5-th percentile | 1541068577 |
| Q1 | 1541173676 |
| median | 1541319766 |
| Q3 | 1541436748 |
| 95-th percentile | 1541529510 |
| Maximum | 1541548799 |
| Range | 518391 |
| Interquartile range (IQR) | 263072 |
Descriptive statistics
| Standard deviation | 150309.1017 |
|---|---|
| Coefficient of variation (CV) | 9.752073421e-05 |
| Kurtosis | -1.220562807 |
| Mean | 1541304041 |
| Median Absolute Deviation (MAD) | 131043.0426 |
| Skewness | -0.1072420208 |
| Sum | 2.455758496e+16 |
| Variance | 2.259282606e+10 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.54103041e+09 1.54103045e+09 1.54103050e+09 1.54103057e+09 1.54103057e+09 ... 1.54154870e+09 1.54154870e+09 1.54154873e+09 1.54154873e+09 1.54154880e+09], "bayesian blocks" binning strategy used)
Common values
| Value | Count | Frequency (%) | |
| 1541443523 | 172 | < 0.1% | |
| 1541449443 | 146 | < 0.1% | |
| 1541362622 | 144 | < 0.1% | |
| 1541536100 | 141 | < 0.1% | |
| 1541440488 | 139 | < 0.1% | |
| 1541449515 | 136 | < 0.1% | |
| 1541364764 | 135 | < 0.1% | |
| 1541275821 | 134 | < 0.1% | |
| 1541546026 | 133 | < 0.1% | |
| 1541444738 | 133 | < 0.1% | |
| Other values (518038) | 15931579 | > 99.9% |
Minimum 5 values
| Value | Count | Frequency (%) | |
| 1541030408 | 1 | < 0.1% | |
| 1541030410 | 1 | < 0.1% | |
| 1541030412 | 1 | < 0.1% | |
| 1541030414 | 1 | < 0.1% | |
| 1541030423 | 3 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
| 1541548799 | 7 | < 0.1% | |
| 1541548798 | 11 | < 0.1% | |
| 1541548797 | 6 | < 0.1% | |
| 1541548796 | 8 | < 0.1% | |
| 1541548795 | 17 | < 0.1% |
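The report treats timestamp as a plain number. Assuming the values are Unix epoch seconds in UTC (the report itself does not say so), the minimum and maximum can be turned into calendar dates to see which period the data covers. A minimal sketch with the standard library:

```python
from datetime import datetime, timezone

# Assumption: `timestamp` holds Unix epoch seconds (UTC).
ts_min = 1541030408  # "Minimum" from the table above
ts_max = 1541548799  # "Maximum" from the table above

start = datetime.fromtimestamp(ts_min, tz=timezone.utc)
end = datetime.fromtimestamp(ts_max, tz=timezone.utc)
span = end - start

print(start.isoformat())
print(end.isoformat())
print(span)  # elapsed time between first and last event
```

Under that assumption the observations span just under six days in early November 2018, and the elapsed seconds equal the reported range of 518391.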
step
Real number (ℝ≥0)
| Distinct count | 3522 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 75.58612186587428 |
|---|---|
| Minimum | 1 |
| Maximum | 3522 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 121.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 8 |
| median | 28 |
| Q3 | 81 |
| 95-th percentile | 304 |
| Maximum | 3522 |
| Range | 3521 |
| Interquartile range (IQR) | 73 |
Descriptive statistics
| Standard deviation | 144.5524398 |
|---|---|
| Coefficient of variation (CV) | 1.912420378 |
| Kurtosis | 58.68380695 |
| Mean | 75.58612187 |
| Median Absolute Deviation (MAD) | 78.98644605 |
| Skewness | 5.879137135 |
| Sum | 1204313075 |
| Variance | 20895.40785 |
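Several of the statistics above are derived from others in the same tables. As a rough sanity check (not the tool's own code), the range, IQR, coefficient of variation and variance for step can be recomputed from the reported minimum, maximum, quartiles, mean and standard deviation:

```python
# Values copied from the `step` tables above.
minimum, maximum = 1, 3522
q1, q3 = 8, 81
mean = 75.58612187
std = 144.5524398

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(value_range, iqr)
print(round(cv, 6), round(variance, 3))
```

A coefficient of variation near 1.9, together with the high skewness, reflects the long right tail: most sessions have few steps, and a handful run into the thousands.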
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.0000e+00 1.5000e+00 2.5000e+00 3.5000e+00 4.5000e+00 ... 2.3405e+03 2.5185e+03 2.6485e+03 2.8165e+03 3.5220e+03], "bayesian blocks" binning strategy used)
Common values
| Value | Count | Frequency (%) | |
| 1 | 910732 | 5.7% | |
| 2 | 712452 | 4.5% | |
| 3 | 584269 | 3.7% | |
| 4 | 490674 | 3.1% | |
| 5 | 426992 | 2.7% | |
| 6 | 377810 | 2.4% | |
| 7 | 342413 | 2.1% | |
| 8 | 314490 | 2.0% | |
| 9 | 292139 | 1.8% | |
| 10 | 274139 | 1.7% | |
| Other values (3512) | 11206882 | 70.3% |
Minimum 5 values
| Value | Count | Frequency (%) | |
| 1 | 910732 | 5.7% | |
| 2 | 712452 | 4.5% | |
| 3 | 584269 | 3.7% | |
| 4 | 490674 | 3.1% | |
| 5 | 426992 | 2.7% |
Maximum 5 values
| Value | Count | Frequency (%) | |
| 3522 | 1 | < 0.1% | |
| 3521 | 1 | < 0.1% | |
| 3520 | 1 | < 0.1% | |
| 3519 | 1 | < 0.1% | |
| 3518 | 1 | < 0.1% |
action_type
Categorical
| Distinct count | 10 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 121.6 MiB |
Common values
| Value | Count | Frequency (%) | |
| interaction item image | 11860750 | 74.4% | |
| clickout item | 1586586 | 10.0% | |
| filter selection | 695917 | 4.4% | |
| search for destination | 403066 | 2.5% | |
| change of sort order | 400584 | 2.5% | |
| interaction item info | 285402 | 1.8% | |
| interaction item rating | 217246 | 1.4% | |
| interaction item deals | 193794 | 1.2% | |
| search for item | 152203 | 1.0% | |
| search for poi | 137444 | 0.9% |
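The frequency column is each count divided by the number of observations. The sketch below recomputes the percentages from the counts in the table and compares the clickout count with the number of rows that have impressions. The names are illustrative, and the link between clickouts and impressions is an inference from the numbers, not something the report states.

```python
# Counts copied from the action_type frequency table above.
counts = {
    "interaction item image": 11_860_750,
    "clickout item": 1_586_586,
    "filter selection": 695_917,
    "search for destination": 403_066,
    "change of sort order": 400_584,
    "interaction item info": 285_402,
    "interaction item rating": 217_246,
    "interaction item deals": 193_794,
    "search for item": 152_203,
    "search for poi": 137_444,
}
n_rows = 15_932_992
missing_impressions = 14_346_406  # from the "Missing" warning for impressions

total = sum(counts.values())
shares = {name: round(100 * c / n_rows, 1) for name, c in counts.items()}
rows_with_impressions = n_rows - missing_impressions

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(rows_with_impressions == counts["clickout item"])
```

The ten counts add up to the full row count, and the number of rows with non-missing impressions equals the clickout count, which suggests that impressions and prices are populated only on clickout rows.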
Length
| Max length | 23 |
|---|---|
| Mean length | 20.65128452 |
| Min length | 13 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 18 | 94.7% | |
| Space_Separator | 1 | 5.3% |
| Value | Count | Frequency (%) | |
| Latin | 18 | 94.7% | |
| Common | 1 | 5.3% |
| Value | Count | Frequency (%) | |
| ASCII | 19 | 100.0% |
reference
Categorical
| Distinct count | 400277 |
|---|---|
| Unique (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 121.6 MiB |
Common values
| Value | Count | Frequency (%) | |
| interaction sort button | 235027 | 1.5% | |
| Sort by Price | 78922 | 0.5% | |
| price only | 78863 | 0.5% | |
| Hotel | 58039 | 0.4% | |
| 5 Star | 46193 | 0.3% | |
| Best Value | 43319 | 0.3% | |
| price and recommended | 43317 | 0.3% | |
| 4 Star | 42625 | 0.3% | |
| Resort | 42343 | 0.3% | |
| Hostal (ES) | 35028 | 0.2% | |
| Other values (400267) | 15229316 | 95.6% |
Length
| Max length | 150 |
|---|---|
| Mean length | 7.262445999 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 121 | 54.5% | |
| Uppercase_Letter | 67 | 30.2% | |
| Decimal_Number | 10 | 4.5% | |
| Other_Letter | 7 | 3.2% | |
| Other_Punctuation | 6 | 2.7% | |
| Space_Separator | 2 | 0.9% | |
| Dash_Punctuation | 2 | 0.9% | |
| Close_Punctuation | 1 | 0.5% | |
| Initial_Punctuation | 1 | 0.5% | |
| Final_Punctuation | 1 | 0.5% | |
| Other values (4) | 4 | 1.8% |
| Value | Count | Frequency (%) | |
| Latin | 157 | 70.7% | |
| Common | 27 | 12.2% | |
| Cyrillic | 23 | 10.4% | |
| Greek | 10 | 4.5% | |
| Han | 5 | 2.3% |
| Value | Count | Frequency (%) | |
| ASCII | 74 | 64.9% | |
| Cyrillic | 23 | 20.2% | |
| Latin Ext Additional | 8 | 7.0% | |
| CJK | 5 | 4.4% | |
| Punctuation | 3 | 2.6% | |
| IPA Ext | 1 | 0.9% |
platform
Categorical
| Distinct count | 55 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 121.6 MiB |
Common values
| Value | Count | Frequency (%) | |
| BR | 2634304 | 16.5% | |
| US | 1627520 | 10.2% | |
| DE | 1001105 | 6.3% | |
| UK | 918900 | 5.8% | |
| MX | 833785 | 5.2% | |
| IN | 679747 | 4.3% | |
| AU | 595003 | 3.7% | |
| TR | 564271 | 3.5% | |
| JP | 547480 | 3.4% | |
| IT | 527046 | 3.3% | |
| Other values (45) | 6003831 | 37.7% |
Length
| Max length | 2 |
|---|---|
| Mean length | 2 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 25 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 25 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 25 | 100.0% |
city
Categorical
| Distinct count | 34752 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 121.6 MiB |
Common values
| Value | Count | Frequency (%) | |
| London, United Kingdom | 326255 | 2.0% | |
| Paris, France | 262060 | 1.6% | |
| Istanbul, Turkey | 230458 | 1.4% | |
| New York, USA | 223320 | 1.4% | |
| Rio de Janeiro, Brazil | 161973 | 1.0% | |
| Amsterdam, Netherlands | 150529 | 0.9% | |
| Rome, Italy | 146798 | 0.9% | |
| Cancun, Mexico | 146004 | 0.9% | |
| Tokyo, Japan | 141557 | 0.9% | |
| Berlin, Germany | 134252 | 0.8% | |
| Other values (34742) | 14009786 | 87.9% |
Length
| Max length | 55 |
|---|---|
| Mean length | 17.62062355 |
| Min length | 8 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 97 | 56.4% | |
| Uppercase_Letter | 58 | 33.7% | |
| Other_Punctuation | 5 | 2.9% | |
| Decimal_Number | 5 | 2.9% | |
| Space_Separator | 2 | 1.2% | |
| Dash_Punctuation | 2 | 1.2% | |
| Final_Punctuation | 1 | 0.6% | |
| Modifier_Symbol | 1 | 0.6% | |
| Modifier_Letter | 1 | 0.6% |
| Value | Count | Frequency (%) | |
| Latin | 154 | 89.5% | |
| Common | 17 | 9.9% | |
| Greek | 1 | 0.6% |
| Value | Count | Frequency (%) | |
| ASCII | 64 | 87.7% | |
| Latin Ext Additional | 6 | 8.2% | |
| Punctuation | 2 | 2.7% | |
| Modifier Letters | 1 | 1.4% |
device
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 121.6 MiB |
Common values
| Value | Count | Frequency (%) | |
| mobile | 7643538 | 48.0% | |
| desktop | 7003938 | 44.0% | |
| tablet | 1285516 | 8.1% |
Length
| Max length | 7 |
|---|---|
| Mean length | 6.439587116 |
| Min length | 6 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 12 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 12 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 12 | 100.0% |
current_filters
Categorical
| Distinct count | 61980 |
|---|---|
| Unique (%) | 5.4% |
| Missing | 14779880 |
| Missing (%) | 92.8% |
| Memory size | 121.6 MiB |
Common values
| Value | Count | Frequency (%) | |
| Sort by Price | 159376 | 1.0% | |
| Focus on Distance | 80143 | 0.5% | |
| Best Value | 74923 | 0.5% | |
| 5 Star|Hotel|Motel|Resort|Hostal (ES) | 26106 | 0.2% | |
| Sort By Distance | 21754 | 0.1% | |
| 5 Star|4 Star|Hotel|Motel|Resort|Hostal (ES) | 21718 | 0.1% | |
| 5 Star|4 Star|3 Star|Hotel|Motel|Resort|Hostal (ES) | 17854 | 0.1% | |
| Excellent Rating | 17506 | 0.1% | |
| Very Good Rating | 16384 | 0.1% | |
| Focus on Rating | 14673 | 0.1% | |
| Other values (61970) | 702675 | 4.4% | |
| (Missing) | 14779880 | 92.8% |
Length
| Max length | 259 |
|---|---|
| Mean length | 5.044709807 |
| Min length | 3 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 25 | 39.7% | |
| Uppercase_Letter | 23 | 36.5% | |
| Decimal_Number | 6 | 9.5% | |
| Other_Punctuation | 3 | 4.8% | |
| Math_Symbol | 2 | 3.2% | |
| Space_Separator | 1 | 1.6% | |
| Dash_Punctuation | 1 | 1.6% | |
| Close_Punctuation | 1 | 1.6% | |
| Open_Punctuation | 1 | 1.6% |
| Value | Count | Frequency (%) | |
| Latin | 48 | 76.2% | |
| Common | 15 | 23.8% |
| Value | Count | Frequency (%) | |
| ASCII | 63 | 100.0% |
impressions
Categorical
| Distinct count | 1059891 |
|---|---|
| Unique (%) | 66.8% |
| Missing | 14346406 |
| Missing (%) | 90.0% |
| Memory size | 121.6 MiB |
Common values
| Value | Count | Frequency (%) | |
| 1668573 | 59 | < 0.1% | |
| 2262316 | 53 | < 0.1% | |
| 2717657 | 53 | < 0.1% | |
| 128010|275217|31740|265842|266112|120896|4329278|5737070|265822|1289285|1501497|1568459|4733508|32102|5705516|3922410|3375350|266047|4523644|266137|120381|120915|270587|120385|263887 | 48 | < 0.1% | |
| 2343986|8288974 | 48 | < 0.1% | |
| 20897|20669|20677|20758|20766|1220350|20736|20674|20789|20737|2669764|6286688|20662|20752|1217792|20745|7127542|20680|84987|20712|20832|945823|20837|81466|20725 | 42 | < 0.1% | |
| 2552508|2628141|320791|521526|383116|521261|2212958|3135658|5963374|1979851|1837395|6819172|3201066|521546|4143828|5922370|438096|3830270|1131877|8866872|1983739|1837423|9344330|2103318|2617715 | 41 | < 0.1% | |
| 9377362|2667270|2663095|1158907|1535635|2788752|10511626|1018283|4989528|10601506|3196195|10632766|2702696|1403190|2704220|4488722|4846662|10644372|7191226|10061308|4712380|1018260|4525298|3583390|4777248 | 38 | < 0.1% | |
| 128010|275217|31740|265842|266112|120896|5737070|4329278|265822|1289285|1501497|1568459|4733508|3922410|32102|5705516|3375350|266047|4523644|266137|120381|120915|270587|120385|263887 | 36 | < 0.1% | |
| 4182500|4431788|6071334 | 35 | < 0.1% | |
| Other values (1059881) | 1586133 | 10.0% | |
| (Missing) | 14346406 | 90.0% |
Length
| Max length | 224 |
|---|---|
| Mean length | 19.54954022 |
| Min length | 3 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 10 | 76.9% | |
| Lowercase_Letter | 2 | 15.4% | |
| Math_Symbol | 1 | 7.7% |
| Value | Count | Frequency (%) | |
| Common | 11 | 84.6% | |
| Latin | 2 | 15.4% |
| Value | Count | Frequency (%) | |
| ASCII | 13 | 100.0% |
prices
Categorical
| Distinct count | 1066775 |
|---|---|
| Unique (%) | 67.2% |
| Missing | 14346406 |
| Missing (%) | 90.0% |
| Memory size | 121.6 MiB |
Common values
| Value | Count | Frequency (%) | |
| 26 | 72 | < 0.1% | |
| 27 | 68 | < 0.1% | |
| 45 | 68 | < 0.1% | |
| 18 | 65 | < 0.1% | |
| 30 | 59 | < 0.1% | |
| 32 | 51 | < 0.1% | |
| 34 | 51 | < 0.1% | |
| 57 | 51 | < 0.1% | |
| 28 | 50 | < 0.1% | |
| 140 | 49 | < 0.1% | |
| Other values (1066765) | 1586002 | 10.0% | |
| (Missing) | 14346406 | 90.0% |
Length
| Max length | 124 |
|---|---|
| Mean length | 10.35702077 |
| Min length | 1 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 10 | 76.9% | |
| Lowercase_Letter | 2 | 15.4% | |
| Math_Symbol | 1 | 7.7% |
| Value | Count | Frequency (%) | |
| Common | 11 | 84.6% | |
| Latin | 2 | 15.4% |
| Value | Count | Frequency (%) | |
| ASCII | 13 | 100.0% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
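As a minimal illustration of that definition (a plain-Python sketch on toy data, not the code pandas-profiling runs), r can be computed as the covariance divided by the product of the standard deviations. The common 1/n factors cancel, so sums of deviations are enough. The function name `pearson_r` and the sample values are made up for this example.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)
# Shifting and rescaling x (with a positive factor) leaves r unchanged.
r_shifted = pearson_r([10 * a + 5 for a in x], y)

print(r, r_shifted)
```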
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
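The rank-based definition can be sketched in plain Python as follows. This is an illustrative example on toy data, not the report's implementation. The helper names are hypothetical, and ties receive average ranks, which is a common convention assumed here.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Pearson correlation applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y_cubic = [v ** 3 for v in x]  # monotonic but not linear

print(pearson_r(x, y_cubic))     # below 1: the relation is not linear
print(spearman_rho(x, y_cubic))  # about 1: the relation is perfectly monotonic
```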
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
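The pair-counting formula as worded here corresponds to the variant usually called τ-a. The sketch below implements exactly that wording on toy data. Library implementations often apply a tie correction (τ-b) instead, so their results can differ when ties are present. The function name is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied in x or y count as neither concordant nor discordant.
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau = kendall_tau_a(x, y)
print(tau)
```

In this toy sample 8 of the 10 pairs are concordant and 2 are discordant, giving τ = (8 - 2) / 10.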
First rows
| | user_id | session_id | timestamp | step | action_type | reference | platform | city | device | current_filters | impressions | prices |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 00RL8Z82B2Z1 | aff3928535f48 | 1541037460 | 1 | search for poi | Newtown | AU | Sydney, Australia | mobile | NaN | NaN | NaN |
| 1 | 00RL8Z82B2Z1 | aff3928535f48 | 1541037522 | 2 | interaction item image | 666856 | AU | Sydney, Australia | mobile | NaN | NaN | NaN |
| 2 | 00RL8Z82B2Z1 | aff3928535f48 | 1541037522 | 3 | interaction item image | 666856 | AU | Sydney, Australia | mobile | NaN | NaN | NaN |
| 3 | 00RL8Z82B2Z1 | aff3928535f48 | 1541037532 | 4 | interaction item image | 666856 | AU | Sydney, Australia | mobile | NaN | NaN | NaN |
| 4 | 00RL8Z82B2Z1 | aff3928535f48 | 1541037532 | 5 | interaction item image | 109038 | AU | Sydney, Australia | mobile | NaN | NaN | NaN |
| 5 | 00RL8Z82B2Z1 | aff3928535f48 | 1541037532 | 6 | interaction item image | 666856 | AU | Sydney, Australia | mobile | NaN | NaN | NaN |
| 6 | 00RL8Z82B2Z1 | aff3928535f48 | 1541037532 | 7 | interaction item image | 109038 | AU | Sydney, Australia | mobile | NaN | NaN | NaN |
| 7 | 00RL8Z82B2Z1 | aff3928535f48 | 1541037532 | 8 | interaction item image | 666856 | AU | Sydney, Australia | mobile | NaN | NaN | NaN |
| 8 | 00RL8Z82B2Z1 | aff3928535f48 | 1541037542 | 9 | interaction item image | 109038 | AU | Sydney, Australia | mobile | NaN | NaN | NaN |
| 9 | 00RL8Z82B2Z1 | aff3928535f48 | 1541037542 | 10 | interaction item image | 109038 | AU | Sydney, Australia | mobile | NaN | NaN | NaN |
Last rows
| | user_id | session_id | timestamp | step | action_type | reference | platform | city | device | current_filters | impressions | prices |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15932982 | ZYNMLE3MV3LK | 62728015bec05 | 1541544480 | 10 | interaction item image | 6617798 | PT | Paris, France | desktop | NaN | NaN | NaN |
| 15932983 | ZYNMLE3MV3LK | 62728015bec05 | 1541544480 | 11 | interaction item image | 6617798 | PT | Paris, France | desktop | NaN | NaN | NaN |
| 15932984 | ZYNMLE3MV3LK | 62728015bec05 | 1541544480 | 12 | interaction item image | 6617798 | PT | Paris, France | desktop | NaN | NaN | NaN |
| 15932985 | ZYNMLE3MV3LK | 62728015bec05 | 1541544480 | 13 | interaction item image | 6617798 | PT | Paris, France | desktop | NaN | NaN | NaN |
| 15932986 | ZYNMLE3MV3LK | 62728015bec05 | 1541544480 | 14 | interaction item image | 6617798 | PT | Paris, France | desktop | NaN | NaN | NaN |
| 15932987 | ZYNMLE3MV3LK | 62728015bec05 | 1541544490 | 15 | interaction item image | 6617798 | PT | Paris, France | desktop | NaN | NaN | NaN |
| 15932988 | ZYNMLE3MV3LK | 62728015bec05 | 1541544491 | 16 | clickout item | 6617798 | PT | Paris, France | desktop | Focus on Distance | 6617798|1263420|9567886|1161323|149768|1890735|48766|49244|18208|129443|6002460|3213646|48511|49976|50117|3503750|153375|49847|4342488|12260|2712342|48497|11933|1714483|1236687 | 58|96|55|75|90|60|233|104|150|145|328|207|150|181|135|99|495|170|118|259|73|169|87|485|171 |
| 15932989 | ZYNMLE3MV3LK | 62728015bec05 | 1541544540 | 17 | clickout item | 2712342 | PT | Paris, France | desktop | Focus on Distance | 6617798|1263420|9567886|1161323|149768|1890735|48766|49244|18208|129443|6002460|3213646|48511|49976|50117|3503750|153375|49847|4342488|12260|2712342|48497|11933|1714483|1236687 | 58|96|55|75|90|60|233|104|150|145|328|207|150|181|135|99|495|170|118|259|73|169|87|485|171 |
| 15932990 | ZYNMLE3MV3LK | 62728015bec05 | 1541544967 | 18 | change of sort order | interaction sort button | PT | Paris, France | desktop | NaN | NaN | NaN |
| 15932991 | ZYNMLE3MV3LK | 62728015bec05 | 1541544973 | 19 | clickout item | 1161323 | PT | Paris, France | desktop | Focus on Distance | 6617798|1263420|9567886|1161323|149768|1890735|48766|49244|18208|129443|6002460|3213646|48511|49976|50117|3503750|153375|49847|4342488|12260|2712342|48497|11933|1714483|1236687 | 58|96|55|75|90|60|233|104|150|145|328|207|150|181|135|99|495|170|118|259|73|169|87|485|171 |
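The impressions and prices cells in the clickout rows are pipe-delimited lists stored as strings, which is why the report treats them as high-cardinality categoricals. Assuming the two lists are positionally aligned (the i-th price belongs to the i-th impression, which the report does not state explicitly), they can be unpacked as sketched below. The function name `parse_clickout` is made up for this example.

```python
# Strings copied from the clickout rows of session 62728015bec05 above.
impressions_raw = (
    "6617798|1263420|9567886|1161323|149768|1890735|48766|49244|18208|129443|"
    "6002460|3213646|48511|49976|50117|3503750|153375|49847|4342488|12260|"
    "2712342|48497|11933|1714483|1236687"
)
prices_raw = (
    "58|96|55|75|90|60|233|104|150|145|328|207|150|181|135|99|495|170|118|259|"
    "73|169|87|485|171"
)

def parse_clickout(impressions, prices):
    """Split both pipe-delimited strings and pair items with prices by position."""
    items = [int(v) for v in impressions.split("|")]
    costs = [int(v) for v in prices.split("|")]
    if len(items) != len(costs):
        raise ValueError("impressions and prices have different lengths")
    return list(zip(items, costs))

pairs = parse_clickout(impressions_raw, prices_raw)
price_by_item = dict(pairs)

clicked = 2712342  # `reference` of the clickout at step 17
position = [item for item, _ in pairs].index(clicked) + 1  # 1-based position in the list

print(len(pairs), position, price_by_item[clicked])
```

Splitting these columns into per-item rows before profiling would give far more informative statistics than the raw string frequencies shown in this report.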